Abstract

The dot-depth hierarchy is a classification of star-free languages. It is related to the quantifier alternation hierarchy of first-order logic over finite words. We consider fragments of languages with dot-depth 1/2 and dot-depth 1 obtained by prohibiting the specification of prefixes or suffixes. As it turns out, these language classes are in one-to-one correspondence with fragments of existential first-order logic without min- or max-predicate. For all fragments, we obtain effective algebraic characterizations. Moreover, we give new combinatorial proofs for the decidability of the membership problem for dot-depth 1/2 and dot-depth 1.

1 Introduction 


The dot-depth hierarchy $\mathcal{B}_n$ for $n \in \mathbb{N} + \{1/2, 1\}$ has been introduced by Cohen and Brzozowski [2]. A very similar hierarchy is the Straubing-Thérien hierarchy $\mathcal{L}_n$, see [16, 18]. Both hierarchies are strict [1] and they exhaust the class of star-free languages. A classical result of McNaughton and Papert is that a language is star-free if and only if it is definable in first-order logic [9]. Thomas [20] has tightened this result by showing that there is a one-to-one correspondence between the dot-depth hierarchy (and also between the Straubing-Thérien hierarchy) and the quantifier alternation hierarchy of first-order logic. More precisely, the dot-depth hierarchy is related to the quantifier alternation hierarchy over the signature [<, +1, min, max], whereas the Straubing-Thérien hierarchy corresponds to the quantifier alternation hierarchy over the signature [<].

Schützenberger has shown that a language is star-free if and only if its syntactic semigroup is aperiodic [13]. The latter property is effectively decidable. Together with the result of McNaughton and Papert, this yields a decision procedure for definability in first-order logic. Effectively determining the level of a language in the dot-depth hierarchy or, equivalently, in the quantifier alternation hierarchy of first-order logic, is one of the most challenging open problems in automata theory. For $n \in \mathbb{N}$, Straubing has shown that membership in $\mathcal{B}_n$ is decidable if and only if membership in $\mathcal{L}_n$ is decidable [17]. This result has been extended to the half-levels by Pin and Weil [12]. Simon has shown that the class of piecewise testable languages $\mathcal{L}_1$ is decidable [14]. Later, Knast [6] gave an effective algebraic characterization of $\mathcal{B}_1$. Decidability of $\mathcal{L}_{1/2}$ was shown by Pin [10], and the levels $\mathcal{B}_{1/2}$ and $\mathcal{L}_{3/2}$ are decidable by a result of Pin and Weil. The most recent decidability result is for $\mathcal{B}_{3/2}$ due to Glaßer and Schmitz [3]. To date, no other levels are known to be decidable.

In this paper, we focus on subclasses of $\mathcal{B}_{1/2}$ and $\mathcal{B}_1$. For both $\mathcal{B}_{1/2}$ and $\mathcal{B}_1$ we give new proofs for their effective algebraic characterizations. The proof of Pin and Weil for $\mathcal{B}_{1/2}$ is based on factorization forests [15], and the proof of Knast [6] as well as the simplified version of Thérien [19] for $\mathcal{B}_1$ are based on a generalization of finite monoids, so-called finite categories [21]. Our proofs are more combinatorial than algebraic. The proof for $\mathcal{B}_1$ is a generalization of Klíma's proof [5] for $\mathcal{L}_1$. The main advantage of our proofs for $\mathcal{B}_{1/2}$ and $\mathcal{B}_1$ over previous ones is that the constants involved in finding language descriptions for given algebraic objects are more explicit (and therefore smaller).

Our main original contributions are effective algebraic characterizations of fragments of existential first-order logic over the signatures [<, +1, min] without max-predicate, [<, +1, max] without min, and [<, +1] without min and max. These fragments also admit language characterizations in terms of subclasses of $\mathcal{B}_{1/2}$ and $\mathcal{B}_1$. The corresponding language classes are obtained by prohibiting the specification of prefixes or suffixes. In contrast to $\mathcal{B}_{1/2}$ and $\mathcal{B}_1$, the resulting subclasses do not form (positive) varieties of languages, but they still can be described using so-called lattice equations [3]. Moreover, there is a tight connection with Cantor topologies over finite words [7]. A more detailed overview of our results can be found in Section [JJ

2 Preliminaries 

Words and Languages. Let $\Gamma$ be a finite non-empty alphabet. The set of finite words is $\Gamma^*$. By $\varepsilon$ we denote the empty word and $\Gamma^+ = \Gamma^* \setminus \{\varepsilon\}$ is the set of finite non-empty words. The length of a word $u \in \Gamma^*$ is $|u|$ and its alphabet is $\mathrm{alph}(u) = \{a \in \Gamma \mid u \in \Gamma^* a \Gamma^*\}$. Similarly, $\mathrm{alph}_k(u) = \{v \in \Gamma^k \mid u \in \Gamma^* v \Gamma^*\}$ is the set of all factors of $u$ of length $k$. A word $v \in \Gamma^*$ is a prefix (resp. suffix, resp. factor) of $u$ if $u \in v\Gamma^*$ (resp. $u \in \Gamma^* v$, resp. $u \in \Gamma^* v \Gamma^*$). We write $v \leq_p u$ if $v$ is a prefix of $u$ and $v <_p u$ if $v$ is a proper prefix of $u$. A quotient of $L \subseteq \Gamma^+$ is a language of the form $u^{-1}L = \{v \in \Gamma^+ \mid uv \in L\}$ or $Lu^{-1} = \{v \in \Gamma^+ \mid vu \in L\}$ for $u \in \Gamma^*$.

A language $L$ is a monomial (of degree $m$) if $L = w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ or $L = w_1 \Gamma^+ w_2 \cdots \Gamma^+ w_n$ for some $n > 0$ and $w_1, \ldots, w_n \in \Gamma^*$ (with $|w_1 \cdots w_n| \leq m$). A language has dot-depth one if it is a Boolean combination of monomials. Throughout this paper, Boolean operations are complementation, finite union, and finite intersection. Positive Boolean operations are finite union and finite intersection.
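As a small illustration of the definition above, membership of a word in a single monomial can be tested with a regular expression. The following is only a sketch: the function names are hypothetical, words are Python strings, and the alphabet $\Gamma$ is taken implicitly to be the set of all characters rather than being passed explicitly.

```python
import re


def monomial_regex(factors, gap="*"):
    """Compile w_1 G* w_2 ... G* w_n (gap="*") or the G+ variant (gap="+").

    `factors` is the list [w_1, ..., w_n]; G stands for "any single letter".
    """
    if not factors:
        raise ValueError("a monomial needs at least one factor w_1")
    if gap not in ("*", "+"):
        raise ValueError("gap must be '*' or '+'")
    # Each factor is matched literally; consecutive factors are separated
    # by an arbitrary (gap="*") or arbitrary non-empty (gap="+") word.
    body = ("." + gap).join(re.escape(w) for w in factors)
    return re.compile(body, re.DOTALL)


def in_monomial(word, factors, gap="*"):
    """Return True if `word` belongs to the monomial given by `factors`."""
    return monomial_regex(factors, gap).fullmatch(word) is not None


def degree(factors):
    """Degree |w_1 ... w_n| of the monomial."""
    return sum(len(w) for w in factors)
```

With this representation, a positive Boolean combination of monomials is an `any`/`all` combination of `in_monomial` calls, and a general Boolean combination additionally uses `not`.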

First-order Logic over Words. We consider the first-order logic FO = FO[<, +1, min, max] over finite words. We view words as sequences of labeled positions which are linearly ordered by <. Variables are interpreted as positions of a word. For variables $x, y$ we have the following atomic formulas: $x < y$ says that $x$ is a position smaller than $y$; and $x = y + 1$ is true if $x$ is the immediate successor of $y$; the formula $\min(x)$ (resp. $\max(x)$) holds if $x$ is the first (resp. last) position. Moreover, we always assume that we have an atomic formula $\top$ (for true), equality of positions $x = y$, and a predicate $\lambda(x) = a$ specifying that position $x$ is labeled by $a \in \Gamma$. Formulas can be composed using Boolean operations, existential quantification, and universal quantification. The semantics is as usual. A sentence is a formula without free variables. For a sentence $\varphi$ of FO we write $u \models \varphi$ if $u$ is a model of $\varphi$, and the language defined by $\varphi$ is $L(\varphi) = \{u \in \Gamma^+ \mid u \models \varphi\}$.

The fragment $\Sigma_1$ consists of all FO-formulas in prenex normal form with only one block of quantifiers and these quantifiers are existential. Let $C \subseteq \{<, +1, \min, \max\}$. By $\Sigma_1[C]$ we denote the class of formulas in $\Sigma_1$ which only use predicates in $C$, equality, and the label predicate. The fragment $\mathbb{B}\Sigma_1[C]$ comprises all Boolean combinations of formulas in $\Sigma_1[C]$.

Finite Semigroups and Recognizable Languages. Let $S$ be a semigroup. An element $x \in S$ is idempotent if $x^2 = x$. The set of idempotents is denoted by $E(S) = \{e \in S \mid e^2 = e\}$. For every finite semigroup $S$ there exists a number $\omega \geq 1$ such that for every $x \in S$, the power $x^\omega$ is the unique idempotent element generated by $x$. Frequently, we consider words $u, v \in S^*$ where the alphabet is a semigroup. We write "$u = v$ in $S$" if either $u = \varepsilon = v$ or $u, v \in S^+$ evaluate to the same element of $S$.

Lemma 1. Let $S$ be a finite semigroup. For every word $u \in S^+$ with length $|u| = |S|$ there exists a non-empty prefix $p$ of $u$ and an idempotent $e \in E(S)$ such that $pe = p$ in $S$.

Proof. Let $a \in S$ be arbitrary and let $p_1 <_p \cdots <_p p_{|S|} <_p p_{|S|+1} = ua$ be the non-empty prefixes of $ua$. By the pigeonhole principle, there exist $1 \leq i < j \leq |S| + 1$ such that $p_i = p_j$ in $S$. In particular, $i \leq |S|$ and $p_i$ is a prefix of $u$. Let $p_i q = p_j$ for $q \in S^+$. We set $e = q^\omega$ to be the idempotent element generated by $q$. Now, $pe = p$ in $S$ for $p = p_i$. □
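The pigeonhole argument in the proof of Lemma 1 is constructive. A minimal sketch follows, under the assumption that the finite semigroup is given as a multiplication table `mul` over the elements `0, ..., n-1` with `mul[a][b]` the product of `a` and `b`. This representation and the function names are choices made for illustration only.

```python
def idempotent_power(mul, x):
    """Return the unique idempotent among the powers x, x^2, x^3, ..."""
    p = x
    while mul[p][p] != p:
        p = mul[p][x]
    return p


def evaluate(mul, word):
    """Evaluate a non-empty word over S to an element of S."""
    value = word[0]
    for letter in word[1:]:
        value = mul[value][letter]
    return value


def stable_prefix(mul, u):
    """Return (p, e): a non-empty prefix p of u and an idempotent e with pe = p.

    Follows the proof of Lemma 1; only the first |S| letters of u are used.
    """
    n = len(mul)
    if len(u) < n:
        raise ValueError("the word must have length at least |S|")
    ua = list(u[:n]) + [u[0]]  # the extra letter a is arbitrary
    first_seen = {}            # value in S -> index of the first prefix with that value
    value = None
    for j, letter in enumerate(ua):
        value = letter if value is None else mul[value][letter]
        if value in first_seen:
            i = first_seen[value]
            q = ua[i + 1 : j + 1]  # p_i q = p_j as words, and p_i = p_j in S
            e = idempotent_power(mul, evaluate(mul, q))
            return list(u[: i + 1]), e
        first_seen[value] = j
    raise AssertionError("unreachable: |S| + 1 prefixes cannot have distinct values")
```

For the two-element semilattice `mul = [[0, 0], [0, 1]]` and `u = [1, 0]`, the prefixes of `ua` evaluate to 1, 0, 0, so the function returns the prefix `[1, 0]` together with the idempotent 1.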

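Green's relations are an important tool in the study of semigroups. They are defined as follows. Let $x \leq_{\mathcal{R}} y$ (resp. $x \leq_{\mathcal{L}} y$, resp. $x \leq_{\mathcal{J}} y$) if there exist $s, t \in S \cup \{1\}$ such that $x = yt$ (resp. $x = sy$, resp. $x = syt$). Let $x \mathrel{\mathcal{R}} y$ (resp. $x \mathrel{\mathcal{L}} y$, resp. $x \mathrel{\mathcal{J}} y$) if $x \leq_{\mathcal{R}} y$ and $y \leq_{\mathcal{R}} x$ (resp. $x \leq_{\mathcal{L}} y$ and $y \leq_{\mathcal{L}} x$, resp. $x \leq_{\mathcal{J}} y$ and $y \leq_{\mathcal{J}} x$). Here, $S^1 = S \cup \{1\}$ is the monoid obtained by adding a new neutral element 1 to the semigroup $S$. The relations $\leq_{\mathcal{R}}$, $\leq_{\mathcal{L}}$, and $\leq_{\mathcal{J}}$ are preorders on $S^1$; and $\mathcal{R}$, $\mathcal{L}$, and $\mathcal{J}$ form equivalence relations.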
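For a finite semigroup these preorders can be computed directly from the definition. The sketch below assumes, purely for illustration, that $S$ is given as a multiplication table `mul` over the elements `0, ..., n-1`; the added neutral element 1 is handled by the `x == y` case instead of being stored in the table.

```python
def r_below(mul, x, y):
    """x <=_R y: x = y*t for some t in S^1."""
    return x == y or x in mul[y]


def l_below(mul, x, y):
    """x <=_L y: x = s*y for some s in S^1."""
    return x == y or any(mul[s][y] == x for s in range(len(mul)))


def j_below(mul, x, y):
    """x <=_J y: x = s*y*t for some s, t in S^1."""
    right = {y} | set(mul[y])                              # y * S^1
    both = right | {mul[s][r] for s in range(len(mul)) for r in right}
    return x in both


def r_equivalent(mul, x, y):
    """x R y: x and y generate the same right ideal."""
    return r_below(mul, x, y) and r_below(mul, y, x)
```

In the two-element left-zero semigroup (`mul[a][b] = a`), every element is only $\leq_{\mathcal{R}}$-below itself, whereas any two elements are $\mathcal{L}$-equivalent.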

Let $\leq$ be a preorder on $S$. A set $P \subseteq S$ is a $\leq$-order ideal if $x \leq y \in P$ implies $x \in P$. The order ideal generated by some subset $P \subseteq S^1$ is $\downarrow\! P = \{x \in S^1 \mid x \leq y \text{ for some } y \in P\}$. An ordered semigroup $S$ is equipped with a compatible partial order $\leq$, i.e., if $p \leq q$ and $s \leq t$, then $ps \leq qt$. Every semigroup is an ordered semigroup with equality as partial order. A language $L \subseteq \Gamma^+$ is recognized by an ordered semigroup $S$ if there exists a homomorphism $h : \Gamma^+ \to S$ such that $L = h^{-1}(P)$ for some $\leq$-order ideal $P$. If the order of $S$ is equality, then we obtain the usual notion of recognition. For a language $L \subseteq \Gamma^+$ the syntactic preorder $\leq_L$ over $\Gamma^+$ is given by $x \leq_L y$ if $uyv \in L \Rightarrow uxv \in L$ for all $u, v \in \Gamma^*$. The syntactic congruence $\equiv_L$ is defined by $x \equiv_L y$ if both $x \leq_L y$ and $y \leq_L x$. The equivalence classes of the syntactic congruence equipped with the canonical composition constitute the syntactic semigroup $\mathrm{Synt}(L)$, and the preorder $\leq_L$ of $\Gamma^+$ becomes a compatible partial order for $\mathrm{Synt}(L)$. The syntactic semigroup of $L$ is finite if and only if $L$ is regular and moreover, $L$ is recognized by its syntactic semigroup. By $\llbracket x^\omega y x^\omega \leq x^\omega \rrbracket$ we denote the class of finite ordered semigroups $S$ such that $x^\omega y x^\omega \leq x^\omega$ for all elements $x, y \in S$. We let $\mathbf{B}_1$ be the class of finite semigroups $S$ such that $(exfy)^\omega exf (tesf)^\omega = (exfy)^\omega esf (tesf)^\omega$ for all idempotents $e, f \in E(S)$ and all elements $x, y, t, s \in S$. Let $\mathbf{LR}$ be the class of finite semigroups $S$ such that $(exeye)^\omega exe = (exeye)^\omega$ for all idempotents $e \in E(S)$ and all elements $x, y \in S$. We have the following inclusions among these classes of semigroups.

Lemma 2. We have $\llbracket x^\omega y x^\omega \leq x^\omega \rrbracket \subseteq \mathbf{B}_1 \subseteq \mathbf{LR}$.

Proof. For a semigroup $S \in \llbracket x^\omega y x^\omega \leq x^\omega \rrbracket$ we have $f \geq f y (exfy)^{\omega-1} e s f$ for all $x, y, s \in S$ and all idempotents $e, f \in S$. Hence $(exfy)^\omega exf (tesf)^\omega \geq (exfy)^\omega ex \bigl(f y (exfy)^{\omega-1} e s f\bigr) (tesf)^\omega = (exfy)^\omega esf (tesf)^\omega$. By symmetry $(exfy)^\omega exf (tesf)^\omega \leq (exfy)^\omega esf (tesf)^\omega$, proving the first inclusion.

We have $(exeye)^\omega exe = (exey)^\omega exe (eeee)^\omega$ for all $x, y$ and all idempotents $e$, and for a semigroup in $\mathbf{B}_1$ this is equal to $(exey)^\omega eee (eeee)^\omega = (exeye)^\omega$. This shows the second inclusion. □

3 Dot-depth 1/2

A language $L \subseteq \Gamma^+$ has dot-depth 1/2 if it is a positive Boolean combination of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ with $w_i \in \Gamma^*$. By a result of Thomas [20], a language has dot-depth 1/2 if and only if it is definable in existential first-order logic $\Sigma_1[<, +1, \min, \max]$. Pin and Weil [11] have shown that $L$ has dot-depth 1/2 if and only if $\mathrm{Synt}(L) \in \llbracket x^\omega y x^\omega \leq x^\omega \rrbracket$. In this section, we give a new proof of these equivalences. The main step in the proof is to show that if $L$ is recognized by some homomorphism $h : \Gamma^+ \to S \in \llbracket x^\omega y x^\omega \leq x^\omega \rrbracket$, then $L$ is a union of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$. The main advantage of our proof is that the degree $|w_1 \cdots w_n|$ is polynomially bounded (Proposition 9), whereas in the proof of Pin and Weil, the bound is exponential.

Theorem 3 (Thomas [20], Pin/Weil [11]). Let $L \subseteq \Gamma^+$. The following assertions are equivalent:

1. $L$ is definable in $\Sigma_1[<, +1, \min, \max]$.

2. $L$ is a finite union of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$.

3. $L$ is a positive Boolean combination of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$.

4. $\mathrm{Synt}(L) \in \llbracket x^\omega y x^\omega \leq x^\omega \rrbracket$.

5. There exists a homomorphism $h : \Gamma^+ \to S$ with $S \in \llbracket x^\omega y x^\omega \leq x^\omega \rrbracket$ such that $L = h^{-1}(P)$ for some $\leq$-order ideal $P$.

In the remainder of this section we prove the above theorem.

Lemma 4. Let $n > 0$, and let $w_1, \ldots, w_n \in \Gamma^*$.

1. The monomial $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ is definable in $\Sigma_1[<, +1, \min, \max]$.

2. The monomial $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^*$ is definable in $\Sigma_1[<, +1, \min]$.

3. The monomial $\Gamma^* w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^*$ is definable in $\Sigma_1[<, +1]$.

Proof. The proof is straightforward. For variable vectors $\mathbf{x} = (x_1, \ldots, x_\ell)$ and $\mathbf{y} = (y_1, \ldots, y_m)$ we use the shortcuts $\exists \mathbf{x}$ for $\exists x_1 \cdots \exists x_\ell$, and $\min(\mathbf{x})$ for $\min(x_1)$ and $\max(\mathbf{x})$ for $\max(x_\ell)$, and $\mathbf{x} < \mathbf{y}$ means $x_\ell < y_1$. Moreover, for a vector $\mathbf{x} = (x_1, \ldots, x_k)$, $\lambda(\mathbf{x}) = a_1 \cdots a_k$ is a shortcut for

$$\bigwedge_{1 \leq j < k} \bigl(x_{j+1} = x_j + 1\bigr) \;\wedge\; \bigwedge_{1 \leq j \leq k} \bigl(\lambda(x_j) = a_j\bigr).$$

Let $L = \Gamma^* w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^*$. We introduce variable vectors $\mathbf{x}_i = (x_{i,1}, \ldots, x_{i,|w_i|})$ for every $i \in \{1, \ldots, n\}$. Then, $L$ is defined by the following sentence $\varphi$:

$$\varphi \;=\; \exists \mathbf{x}_1 \cdots \exists \mathbf{x}_n : \bigwedge_{1 \leq i \leq n} \bigl(\lambda(\mathbf{x}_i) = w_i\bigr) \;\wedge\; \bigwedge_{1 \leq i < n} \bigl(\mathbf{x}_i < \mathbf{x}_{i+1}\bigr).$$

The first term of the conjunction ensures that each $\mathbf{x}_i$ corresponds to a factor $w_i$, whereas the second term ensures that the factors $w_i$ occur in the correct order. The sentence for $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^*$ is $\varphi \wedge \min(\mathbf{x}_1)$ and the sentence for $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ is $\varphi \wedge \min(\mathbf{x}_1) \wedge \max(\mathbf{x}_n)$. □
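The existential sentences constructed in the proof of Lemma 4 can be evaluated on a concrete word by brute force. Since the successor atoms force the positions of each variable vector to be consecutive, an assignment is determined by the start position of every factor. The sketch below uses this observation; the function name, the 0-based positions, and the restriction to non-empty factors are assumptions made for the illustration.

```python
from itertools import product


def satisfies_sentence(u, factors, need_min=False, need_max=False):
    """Check whether u satisfies the sentence for G* w_1 G* ... G* w_n G*.

    need_min adds the conjunct min(x_1) (no leading G*),
    need_max adds the conjunct max(x_n) (no trailing G*).
    """
    if not factors or any(len(w) == 0 for w in factors):
        raise ValueError("this sketch assumes at least one factor, all non-empty")
    # Candidate start positions for each factor (empty range if w is too long).
    start_ranges = [range(len(u) - len(w) + 1) for w in factors]
    for starts in product(*start_ranges):
        ends = [s + len(w) - 1 for s, w in zip(starts, factors)]
        # First conjunct: the positions of x_i spell the factor w_i.
        labels_ok = all(u[s : e + 1] == w for s, e, w in zip(starts, ends, factors))
        # Second conjunct: last position of x_i is before first position of x_{i+1}.
        order_ok = all(ends[i] < starts[i + 1] for i in range(len(factors) - 1))
        min_ok = not need_min or starts[0] == 0
        max_ok = not need_max or ends[-1] == len(u) - 1
        if labels_ok and order_ok and min_ok and max_ok:
            return True
    return False
```

The enumeration is exponential in the number of factors and is meant only to mirror the semantics of the sentence, not to be an efficient matching procedure.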

Lemma 5. Let $L \subseteq \Gamma^+$ be definable by a sentence $\varphi \in \Sigma_1[<, +1, \min, \max]$ with $m$ variables. Then $L$ is a finite union of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ of degree at most $m$.

Proof. Let $\varphi = \exists x_1 \cdots \exists x_m : \psi$ for a propositional formula $\psi$. Suppose $(u, x_1, \ldots, x_m) \models \psi$ for positions $x_i$ of $u$. We say that a position $j$ of $u$ is marked if $j = x_i$ for some $i$. In order to avoid case distinctions we can introduce two new variables such that the first and the last position of $u$ are marked. Let $u = w_1 u_1 w_2 \cdots u_{n-1} w_n$ for $u_i \in \Gamma^+$ such that the factors $w_i$ consist of the marked positions. Now, $P_u = w_1 \Gamma^+ w_2 \cdots \Gamma^+ w_n$ is a monomial of degree $|w_1 \cdots w_n| \leq m$ with $u \in P_u$. Moreover, $P_u \subseteq L(\varphi)$ since the satisfying assignment of $u$ can be adapted to all $v \in P_u$. It follows that $L(\varphi) = \bigcup_{u \in L(\varphi)} P_u$ and this union is finite since there are only finitely many monomials of degree at most $m$. □
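The construction of the monomial $P_u$ in the proof of Lemma 5 only depends on the word and its marked positions. A minimal sketch, with hypothetical names and 0-based positions, that returns the factors $w_1, \ldots, w_n$ of $P_u = w_1 \Gamma^+ w_2 \cdots \Gamma^+ w_n$:

```python
def monomial_from_marked(u, marked):
    """Return [w_1, ..., w_n], the maximal blocks of marked positions of u.

    As in the proof, the first and the last position are always marked.
    Consecutive blocks are separated by at least one unmarked position,
    which is why the gaps of P_u are G+ rather than G*.
    """
    if not u:
        raise ValueError("u must be a non-empty word")
    positions = sorted(set(marked) | {0, len(u) - 1})
    if positions[0] < 0 or positions[-1] >= len(u):
        raise ValueError("marked positions must lie inside u")
    factors = []
    start = prev = positions[0]
    for pos in positions[1:]:
        if pos != prev + 1:          # a gap of unmarked positions ends the block
            factors.append(u[start : prev + 1])
            start = pos
        prev = pos
    factors.append(u[start : prev + 1])
    return factors
```

For example, marking only position 2 in `"abcde"` yields the factors `["a", "c", "e"]`, i.e. the monomial $a \Gamma^+ c \Gamma^+ e$.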

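Lemma 6. Let $L \subseteq \Gamma^+$ be a finite union of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$. Then $\mathrm{Synt}(L) \in \llbracket x^\omega y x^\omega \leq x^\omega \rrbracket$.

Proof. Let $P = w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ and let $u, x, y, v \in \Gamma^+$ and choose $m$ such that $|x^m| \geq |w_1 \cdots w_n|$. Suppose $u x^m v \in P$. Let $i$ be maximal such that $u x^m \in w_1 \Gamma^* w_2 \cdots \Gamma^* w_i \Gamma^* = Q_i$ and let $j$ be minimal such that $x^m v \in \Gamma^* w_j \cdots \Gamma^* w_n = R_j$. By the choice of $m$ we have $j \leq i + 1$. Hence, $u x^m y x^m v \in Q_i R_j \subseteq P$. □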

Lemma 7. Let $S$ be a finite semigroup. For every $w \in S^+$ there exists a factorization $w = x_1 w_1 y_1 \cdots x_m w_m y_m s$ with

1. $w_i, s \in S^*$, $x_i, y_i \in S^+$, $|y_i| \leq |S|$,

2. $0 \leq m \leq |S|$ and $|x_1 y_1 \cdots x_m y_m s| \leq 2|S|^2 + |S|$,

3. $\forall i \in \{1, \ldots, m\}\ \exists e_i \in E(S) : x_i = x_i e_i$ in $S$ and $y_i = y_i e_i$ in $S$.

Proof. For $w \in S^*$, let $E(w)$ be the set of all $e \in E(S)$ such that there exists a factor $x \in S^+$ of $w$ with $|x| \leq |S|$ and $xe = x$ in $S$. We prove the existence of the factorization by induction on $|E(w)|$ with the stronger assertions that $m \leq |E(w)|$ and $|x_1 y_1 \cdots x_m y_m s| \leq 2|S||E(w)| + |S|$ instead of condition '2'. Suppose $|E(w)| = 0$. By Lemma 1 we have $|w| < |S|$. Hence, we can choose $m = 0$ and $s = w$.

If $|E(w)| \geq 1$, then Lemma 1 yields a non-empty prefix $x$ of $w$ with $|x| \leq |S|$ such that $xe = x$ in $S$ for some idempotent $e \in E(S)$. Write $w = xw'$. We have to distinguish two cases. The first case is $e \notin E(w')$. By induction, there exists a factorization $w' = x_1 w_1 y_1 \cdots x_m w_m y_m s$ with $m \leq |E(w')| < |E(w)|$ and $|x_1 y_1 \cdots x_m y_m s| \leq 2|S||E(w')| + |S|$ satisfying conditions '1' and '3'. If $m \geq 1$, then we set $x'_1 = x x_1$. Now, $w = x'_1 w_1 y_1 \cdots x_m w_m y_m s$ is a desired factorization of $w$. If $m = 0$, then the factorization for $w$ is $w = s'$ with $m = 0$, where $s' = xs$.

The second case is $e \in E(w')$. Let $w' = w_0 y_0 w''$ such that $y_0 \in S^+$, $|y_0| \leq |S|$, $y_0 e = y_0$ in $S$ and $e \notin E(w'')$, i.e., we take $y_0$ as the last short factor of $w'$ such that it is stabilized by $e$. By induction, there exists a factorization $w'' = x_1 w_1 y_1 \cdots x_m w_m y_m s$. Now, $w = x_0 w_0 y_0 \cdots x_m w_m y_m s$ with $x_0 = x$ is a factorization of $w$ of the desired form. □

Lemma 8. Let $S \in \mathbf{LR}$ be a finite semigroup. Let $u, x \in S^+$ and let $e \in S$ be idempotent such that $u = ue$ and $x = xe$ in $S$. If $ux \mathrel{\mathcal{R}} u$, then $ux = u$ in $S$.

Proof. Let $y \in S^*$ such that $uxy = u$. In $S$ we have $u = u(exeye)^\omega = u(exeye)^\omega exe = ux$ where the second equality follows because $S \in \mathbf{LR}$. □

Proposition 9. Let $L \subseteq \Gamma^+$ be recognized by $S \in \llbracket x^\omega y x^\omega \leq x^\omega \rrbracket$. Then $L$ is a union of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ of degree $|w_1 w_2 \cdots w_n| \leq 2|S|^3 + |S|^2$ and $n \leq |S|^2$.

Proof. Let $h : \Gamma^+ \to S$ be a homomorphism recognizing $L$. We define the depth of the word $u \in \Gamma^+$ as $d(u) = |\{s \in S^1 \mid h(u) \leq_{\mathcal{R}} s\}|$. For each $u \in \Gamma^+$ we construct a language $P_u = w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ with $|w_1 w_2 \cdots w_n| \leq 2d(u)|S|^2 + d(u)|S|$ such that

$$u \in P_u \subseteq h^{-1}\bigl(\downarrow\! h(u)\bigr),$$

i.e., $h(v) \leq h(u)$ for all $v \in P_u$.

In order to avoid unnecessary case distinctions, we set $P_\varepsilon = \varepsilon$ and $h(\varepsilon) >_{\mathcal{R}} h(u)$ for all $u \in \Gamma^+$. Let $u = vw$, $v \in \Gamma^*$, $w \in a\Gamma^*$ such that $h(v) >_{\mathcal{R}} h(va) \mathrel{\mathcal{R}} h(u)$. Now, $d(v) < d(u)$ and hence by induction, there exists a monomial $P_v$ with $v \in P_v \subseteq h^{-1}(\downarrow\! h(v))$ of degree less than $2d(u)|S|^2 + d(u)|S| - 2|S|^2 - |S|$. By Lemma 7 we find a factorization $w = x_1 u_1 y_1 \cdots x_m u_m y_m s$ such that $|x_1 y_1 \cdots x_m y_m s| \leq 2|S|^2 + |S|$ and for all $i \in \{1, \ldots, m\}$ there exists an idempotent $e_i$ with $h(x_i) e_i = h(x_i)$ and $h(y_i) e_i = h(y_i)$. Using Lemma 8 we see $h(u) = h(vw) = h(v x_1 \cdots x_m s)$. Now, define the monomial $P_u = P_v x_1 \Gamma^* y_1 x_2 \Gamma^* \cdots y_{m-1} x_m \Gamma^* y_m s$ of degree less than $2d(u)|S|^2 + d(u)|S|$. By construction $u \in P_u$. Consider $v'w' \in P_u$ with $v' \in P_v$ and $w' = x_1 w'_1 y_1 x_2 w'_2 \cdots y_{m-1} x_m w'_m y_m s$. We have $h(v') \leq h(v)$ and since $ese \leq e$ for all $s \in S$ and all $e \in E(S)$ we see that $h(x_i) = h(x_i) e_i \geq h(x_i) e_i h(w'_i y_i) e_i = h(x_i w'_i y_i)$. Therefore,

$$h(x_1 \cdots x_m s) \geq h(w') \quad\text{and}\quad h(u) = h(v x_1 \cdots x_m s) \geq h(v'w').$$

Now with the above properties, $L \subseteq \bigcup_{u \in L} P_u \subseteq L$ and this union is finite since there are only finitely many monomials of degree less than $2|S|^3 + |S|^2$. □

We are now ready to prove Theorem 3.

Proof (Theorem 3). '1 $\Rightarrow$ 2': This is Lemma 5. '2 $\Rightarrow$ 1' follows from property '1' of Lemma 4 and the fact that $\Sigma_1[<, +1, \min, \max]$ is closed under disjunction.

The implication '2 $\Rightarrow$ 3' is trivial, and '3 $\Rightarrow$ 4' is Lemma 6 since the class of languages recognizable by semigroups in $\llbracket x^\omega y x^\omega \leq x^\omega \rrbracket$ is closed under positive Boolean combinations. '4 $\Rightarrow$ 5' is trivial. Finally, '5 $\Rightarrow$ 2' follows immediately from Proposition 9. □



4 Existential First-order Logic without min or max 

At higher levels of the quantifier alternation hierarchy, it is possible to specify the prefix and the suffix of a word by using successor +1 as the only predicate (apart from labels $\lambda(x) = a$ for $a \in \Gamma$). At the level $\Sigma_1$, the min-predicate is required to determine prefixes, and max is required for suffixes. We have the following inclusions:

$$\Sigma_1[<] \;\subsetneq\; \Sigma_1[<, +1] \;\subsetneq\; \begin{array}{c} \Sigma_1[<, +1, \min] \\ \Sigma_1[<, +1, \max] \end{array} \;\subsetneq\; \Sigma_1[<, +1, \min, \max]$$

By a result of Pin [10], it is decidable whether a given regular language is definable in $\Sigma_1[<]$. For $\Sigma_1[<, +1, \min, \max]$, decidability follows by a result of Pin and Weil (or alternatively by Theorem 3). In this section, we characterize the languages definable in the other fragments and we show that definability within these fragments is decidable. As it turns out, these decidability results are a combination of effective algebraic and effective topological properties.

Theorem 10. Let $L \subseteq \Gamma^+$. The following assertions are equivalent:

1. $L$ is definable in $\Sigma_1[<, +1, \min]$.

2. $L$ is a finite union of monomials $w_1 \Gamma^* \cdots w_n \Gamma^*$.

3. $\mathrm{Synt}(L) \in \llbracket x^\omega y x^\omega \leq x^\omega \rrbracket$ and $h_L(L)$ is a $\leq_{\mathcal{R}}$-order ideal.

Proof. '1 $\Rightarrow$ 2': Let $L = L(\varphi)$ for $\varphi \in \Sigma_1[<, +1, \min]$. By Theorem 3 the language $L$ is a finite union of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$. Let $u \models \varphi$. Then for every $v \in \Gamma^*$ the same assignment of the variables which makes $\varphi$ true on $u$ also satisfies $\varphi$ on $uv$. Therefore, $L\Gamma^* \subseteq L$. Since $(P \cup Q)\Gamma^* = P\Gamma^* \cup Q\Gamma^*$, it follows that $L$ is a finite union of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^*$.

'2 $\Rightarrow$ 1': This follows from '2' in Lemma 4.

'2 $\Rightarrow$ 3': $\mathrm{Synt}(L) \in \llbracket x^\omega y x^\omega \leq x^\omega \rrbracket$ follows from Theorem 3. By [3, Theorem 1] we see that $h_L(L)$ is a $\leq_{\mathcal{R}}$-order ideal. The implication '3 $\Rightarrow$ 2' follows from Proposition 9 and [3, Theorem 1]. □

Of course, there also is a left-right dual of the above theorem: A language $L$ is definable in $\Sigma_1[<, +1, \max]$ if and only if $L$ is a union of monomials of the form $\Gamma^* w_1 \cdots \Gamma^* w_n$ if and only if $\mathrm{Synt}(L) \in \llbracket x^\omega y x^\omega \leq x^\omega \rrbracket$ and $h_L(L)$ is a $\leq_{\mathcal{L}}$-order ideal. The following theorem is the analogue of Theorem 10 with neither min nor max predicates.

Theorem 11. Let $L \subseteq \Gamma^+$. The following assertions are equivalent:

1. $L$ is definable in $\Sigma_1[<, +1]$.

2. $L$ is a finite union of monomials $\Gamma^* w_1 \cdots \Gamma^* w_n \Gamma^*$.

3. $\mathrm{Synt}(L) \in \llbracket x^\omega y x^\omega \leq x^\omega \rrbracket$ and $h_L(L)$ is a $\leq_{\mathcal{J}}$-order ideal.

Proof. '1 $\Rightarrow$ 2': Let $L$ be defined by $\varphi \in \Sigma_1[<, +1]$. By Theorem 3, $L$ is a finite union of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$. Let $u, v, w \in \Gamma^*$. Every assignment satisfying $\varphi$ on $u$ also satisfies $\varphi$ on $vuw$. Hence, $\Gamma^* L \Gamma^* \subseteq L$.

'2 $\Rightarrow$ 1': This follows from '3' in Lemma 4.

'2 $\Rightarrow$ 3': $\mathrm{Synt}(L) \in \llbracket x^\omega y x^\omega \leq x^\omega \rrbracket$ follows from Theorem 3; the set $h_L(L)$ is a $\leq_{\mathcal{J}}$-order ideal by [7, Theorem 3]. The implication '3 $\Rightarrow$ 2' follows from Proposition 9 and [7, Theorem 3]. □

The following decidability result is an immediate consequence of our characterizations.

Corollary 12. Let $L \subseteq \Gamma^+$ be a regular language. It is decidable whether $L$ is definable in $\Sigma_1[<, +1]$ (resp. $\Sigma_1[<, +1, \min]$, resp. $\Sigma_1[<, +1, \max]$).

Proof. The syntactic homomorphism $h_L : \Gamma^+ \to \mathrm{Synt}(L)$ of $L$ is effectively computable. Hence, one can verify whether property '3' in Theorem 11 (resp. '3' in Theorem 10, resp. the left-right dual of '3' in Theorem 10) holds. □
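The test behind Corollary 12 can be sketched for the case of Theorem 10 as follows. The sketch assumes that the syntactic ordered semigroup has already been computed (this step is not shown) and is given as a multiplication table `mul` over `0, ..., n-1`, a callable `leq(a, b)` for the order, and the set `image` $= h_L(L)$. All names and this input format are assumptions of the illustration.

```python
def idempotent_power(mul, x):
    """Return the unique idempotent among the powers x, x^2, x^3, ..."""
    p = x
    while mul[p][p] != p:
        p = mul[p][x]
    return p


def satisfies_half_level_identity(mul, leq):
    """Check x^w y x^w <= x^w for all elements x, y."""
    n = len(mul)
    for x in range(n):
        xw = idempotent_power(mul, x)
        for y in range(n):
            if not leq(mul[mul[xw][y]][xw], xw):
                return False
    return True


def r_below(mul, x, y):
    """x <=_R y: x = y*t for some t in S^1."""
    return x == y or x in mul[y]


def is_order_ideal(subset, below, n):
    """Check that x below y and y in subset imply x in subset."""
    return all(x in subset for y in subset for x in range(n) if below(x, y))


def passes_theorem10_test(mul, leq, image):
    """Condition '3' of Theorem 10 for a precomputed ordered semigroup."""
    return (satisfies_half_level_identity(mul, leq)
            and is_order_ideal(image, lambda x, y: r_below(mul, x, y), len(mul)))
```

For the variants with max or with neither predicate, `r_below` would be replaced by the corresponding check for $\leq_{\mathcal{L}}$ or $\leq_{\mathcal{J}}$.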
5 Dot-Depth One

A language $L \subseteq \Gamma^+$ has dot-depth 1 if it is a Boolean combination of monomials of the form $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ with $w_i \in \Gamma^*$. Knast [6] has shown that a language $L$ has dot-depth 1 if and only if $\mathrm{Synt}(L) \in \mathbf{B}_1$. Since the latter property is decidable, this gives decidability of dot-depth 1. Later, Thérien [19] gave a simpler proof for Knast's result. Both proofs are based on an algebraic concept called finite categories, see [21]. In this section, we give a new (more combinatorial) proof of this theorem. As for dot-depth 1/2, the main advantage of our proof is that the bounds involved are more explicit.
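Decidability of membership in $\mathbf{B}_1$ can be made concrete by a brute-force check of the defining identity $(exfy)^\omega exf (tesf)^\omega = (exfy)^\omega esf (tesf)^\omega$. The sketch below assumes that the finite semigroup is given as a multiplication table `mul` over `0, ..., n-1`; the names are hypothetical, and the running time is polynomial in $|S|$ but with a high exponent, so it is only suitable for small examples. The identity of $\mathbf{LR}$ is included for comparison.

```python
from itertools import product


def idempotent_power(mul, x):
    """Return the unique idempotent among the powers x, x^2, x^3, ..."""
    p = x
    while mul[p][p] != p:
        p = mul[p][x]
    return p


def times(mul, *elements):
    """Product of one or more elements of S, taken from left to right."""
    value = elements[0]
    for x in elements[1:]:
        value = mul[value][x]
    return value


def in_B1(mul):
    """Check (exfy)^w exf (tesf)^w = (exfy)^w esf (tesf)^w for all e, f, x, y, s, t."""
    elems = range(len(mul))
    idem = [e for e in elems if mul[e][e] == e]
    for e, f in product(idem, repeat=2):
        for x, y, s, t in product(elems, repeat=4):
            left = idempotent_power(mul, times(mul, e, x, f, y))
            right = idempotent_power(mul, times(mul, t, e, s, f))
            if times(mul, left, e, x, f, right) != times(mul, left, e, s, f, right):
                return False
    return True


def in_LR(mul):
    """Check (exeye)^w exe = (exeye)^w for all idempotents e and all x, y."""
    elems = range(len(mul))
    idem = [e for e in elems if mul[e][e] == e]
    for e in idem:
        for x, y in product(elems, repeat=2):
            power = idempotent_power(mul, times(mul, e, x, e, y, e))
            if times(mul, power, e, x, e) != power:
                return False
    return True
```

The two-element group fails both identities, while the two-element semilattice and the two-element left-zero semigroup satisfy both.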

Theorem 13 (Thomas [20], Knast [6]). Let $L \subseteq \Gamma^+$. The following assertions are equivalent:

1. $L$ is definable in $\mathbb{B}\Sigma_1[<, +1, \min, \max]$.

2. $L$ is a Boolean combination of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$.

3. $\mathrm{Synt}(L) \in \mathbf{B}_1$.

4. $L$ is recognized by some semigroup in $\mathbf{B}_1$.

As for dot-depth 1/2, the equivalence of $\mathbb{B}\Sigma_1[<, +1, \min, \max]$ and dot-depth 1 is due to a result by Thomas [20]. The remainder of this section is devoted to the proof of the above theorem.

Lemma 14. Let $S$ be a finite semigroup and let $u \in S^+$. Suppose there exists $e \in S$ such that $pe = p$ in $S$ for some prefix $p \leq_p u$. Choose $|p|$ maximal with this property and let $u = pv$. If $xpvy = x'pvy'$ for some $x \neq x'$, then there is at least one letter between the factors $v$ in the two factorizations.

Proof. Let $|x'| < |x|$ and assume that the claim is not true.

[Figure: the same word drawn twice, once factorized as $x \cdot p \cdot v \cdot y$ and once as $x' \cdot p \cdot v \cdot y'$, together with the induced factorization $p' \cdot v'$.]

Then we find a factorization $pv = p'v'$ such that $p'e = p'$ and $|p'| > |p|$, contradicting the maximality of $|p|$. This also holds if the factors do not overlap but are adjacent, in which case $p' = pv$. □

The following lemma will serve as the link between the algebraic properties of $\mathbf{B}_1$ and the combinatorial properties in Lemma 16 below.

Lemma 15. Let $S \in \mathbf{LR}$ and let $k \geq |S| + 1$. For every $a \in S$ and for all $u, x \in S^+$ with $|x| \geq k$ we have: $u \mathrel{\mathcal{R}} ux >_{\mathcal{R}} uxa \;\Rightarrow\; \mathrm{alph}_k(x) \neq \mathrm{alph}_k(xa)$.

Proof. Suppose $u \mathrel{\mathcal{R}} ux$ and $\mathrm{alph}_k(x) = \mathrm{alph}_k(xa)$. Let $w$ be the suffix of $xa$ of length $k$. By Lemma 1 there exist $p, v \in S^*$ such that $w = pva$ in $S^+$ and $pe = p$ in $S$ for some idempotent $e \in S$. Let $|p| \leq |S|$ be maximal with this property. Since $w \in \mathrm{alph}_k(xa) = \mathrm{alph}_k(x)$ we can write

$$x = spvatv \quad\text{in } S^+$$

for some $s, t \in S^*$ such that $p$ is a suffix of $pvat$. Note that there is indeed at least one letter between the two occurrences of $v$ by Lemma 14. For $u' = usp$ and $x' = vat$ we have $u' = u'e$, $u'x' = u'x'e$, and $u' \mathrel{\mathcal{R}} u'x'$. Using Lemma 8 we see that $u' = u'x' = u'x'x'$. Hence, $u \mathrel{\mathcal{R}} u' = u'x'x' = uxat$ and therefore, $u \mathrel{\mathcal{R}} uxa$. □

The following lemma is the main combinatorial ingredient for our proof of Knast's Theorem. It generalizes an idea of Klíma [5] to factors of words. The determinacy mechanism is similar to unary interval logic with lookaround [8].

Lemma 16. Let $x_i, y_i, u_i, u'_i, v_i, v'_i \in \Gamma^+$ and $u_k, v_k, u'_1, v'_1 \in \Gamma^*$, and let

$$u = x_1 u_1 \cdots x_k u_k = u'_1 y_1 \cdots u'_\ell y_\ell$$
$$v = x_1 v_1 \cdots x_k v_k = v'_1 y_1 \cdots v'_\ell y_\ell$$

such that $x_1 u_1 \cdots x_k$ (resp. $x_1 v_1 \cdots x_k$) is the shortest prefix of $u$ (resp. $v$) in $x_1 \Gamma^+ x_2 \cdots \Gamma^+ x_k$, and $y_1 \cdots u'_\ell y_\ell$ (resp. $y_1 \cdots v'_\ell y_\ell$) is the shortest suffix of $u$ (resp. $v$) in $y_1 \Gamma^+ y_2 \cdots \Gamma^+ y_\ell$.

If $u$ and $v$ are contained in the same monomials $w_1 \Gamma^+ w_2 \cdots \Gamma^+ w_n$ with $n \leq k + \ell$ and degree $|w_1 \cdots w_n| \leq |x_1 \cdots x_k y_1 \cdots y_\ell|$, then the relative positions of $x_k$ and $y_1$ are the same in $u$ as in $v$. More precisely,

1. $x_1 u_1 \cdots x_k$ is a prefix of $u'_1$ iff $x_1 v_1 \cdots x_k$ is a prefix of $v'_1$,

2. if $x_k$ and $y_1$ overlap in $u$ or in $v$, then they have the same overlap in both words,

3. $u'_1 y_1$ is a prefix of $x_1 u_1 \cdots x_{k-1} u_{k-1}$ iff $v'_1 y_1$ is a prefix of $x_1 v_1 \cdots x_{k-1} v_{k-1}$.

Proof. '1': Suppose that $x_1 u_1 \cdots x_k$ is a prefix of $u'_1$. Then $u$ is contained in the language $x_1 \Gamma^+ \cdots x_k \Gamma^+ y_1 \cdots \Gamma^+ y_\ell$ or in $x_1 \Gamma^+ \cdots x_k y_1 \cdots \Gamma^+ y_\ell$. Hence $v$ is contained in one of these two monomials, showing that $x_1 v_1 \cdots x_k$ is a prefix of $v'_1$.

'2': We can assume that none of the conditions in '1' holds. We have to distinguish two cases. First, suppose that $x_k$ and $y_1$ overlap in $u$ such that $x_1 u_1 \cdots x_k$ is a prefix of $u'_1 y_1$ and let $z$ be the word comprising all positions of $x_k$ and $y_1$ in $u$. Then $u \in P = x_1 \Gamma^+ \cdots x_{k-1} \Gamma^+ z \Gamma^+ y_2 \cdots \Gamma^+ y_\ell$. Hence $v \in P$, showing that $x_k$ and $y_1$ in $v$ have at most the same overlap as in $u$.

The second case is that $x_k$ and $y_1$ overlap in $u$ such that $x_1 u_1 \cdots x_k$ is not a prefix of $u'_1 y_1$. Moreover, we can assume that $x_1 v_1 \cdots x_k$ is not a prefix of $v'_1 y_1$ since otherwise we are in the first case with $u$ and $v$ interchanged. Now, $u$ is contained in $P = x_1 \Gamma^+ \cdots x_i \Gamma^+ z \Gamma^+ y_j \cdots \Gamma^+ y_\ell$, where $z$ is the factor of $u$ comprising all $x_{i+1}, \ldots, x_k$ which are overlapping (or adjacent) with $y_1$ and all $y_1, \ldots, y_{j-1}$ which are overlapping (or adjacent) with $x_k$. Since $v \in P$, we conclude that $x_k$ and $y_1$ in $v$ have at least the same overlap as in $u$.

'3': If none of the conditions in '1' and '2' holds, then in both words $u$ and $v$, the factor $y_1$ is on the left-hand side of $x_k$. □

Lemma 17. Let $S \in \mathbf{B}_1$. For all $u, v, x, s \in S$ and for all $e, f \in E(S)$, the following implication holds: $u \mathrel{\mathcal{R}} uexf,\ esfv \mathrel{\mathcal{L}} v \;\Rightarrow\; uexfv = uesfv$.

Proof. Since $u \mathrel{\mathcal{R}} uexf$ and $v \mathrel{\mathcal{L}} esfv$, there exist $y, t \in S$ with $u = uexfy$ and $v = tesfv$. In particular, $u = u(exfy)^\omega$ and $v = (tesf)^\omega v$. We conclude

$$uexfv = u(exfy)^\omega exf (tesf)^\omega v = u(exfy)^\omega esf (tesf)^\omega v = uesfv,$$

where the second equality uses $S \in \mathbf{B}_1$. □

Proposition 18. Let $L \subseteq \Gamma^+$ be recognized by $h : \Gamma^+ \to S$ with $S \in \mathbf{B}_1$. If words $u$ and $v$ are contained in the same monomials $w_1 \Gamma^+ w_2 \cdots \Gamma^+ w_n$ with $n \leq 2|S|$ and degree $|w_1 \cdots w_n| \leq 4|S|^2 - 2|S|$, then $h(u) = h(v)$.

Proof. This proof was inspired by Klíma's proof [5] of Simon's Theorem on piecewise-testable languages. The outline of our proof is as follows. We consider factorizations induced by the $\mathcal{R}$-factorization of $u$ and the $\mathcal{L}$-factorization of $v$. Then we transfer the factorization of $u$ to $v$ and vice versa such that the respective orders of the factors in $u$ and $v$ are the same. Finally, we transform $u$ into $v$ by a sequence of $h$-invariant substitutions.

Consider the $\mathcal{R}$-factorization $u = a_1 u_1 \cdots a_k u_k$ such that

$$h(a_1 u_1 \cdots a_i) \mathrel{\mathcal{R}} h(a_1 u_1 \cdots a_i u_i) >_{\mathcal{R}} h(a_1 u_1 \cdots a_i u_i a_{i+1})$$

for all $i$. We have $k \leq |S|$. Let $j_i$ be the position of $a_i$ in the above factorization. We color all positions of $u$ in any of the intervals $[\, j_i - |S| \,;\, j_i + |S| - 1 \,]$ red. In particular, the $a_i$-positions $j_i$ are red. And in general, there is a neighborhood of size $2|S|$ around each $a_i$ which contains only red positions. In the worst case, $a_1$ is the only exception. Hence, there are at most $2|S|^2 - |S|$ red positions in $u$. Let $R_i$ be the $i$-th consecutive factor of red positions. Then $u = R_1 u'_1 \cdots R_{k'} u'_{k'}$ for some $u'_i \in \Gamma^+$, $i < k'$, and $u'_{k'} \in \Gamma^*$. Note that $k' \leq k$ since some intervals could overlap. By Lemma 15 the word $R_1 u'_1 \cdots R_i$ is the shortest prefix of $u$ contained in $R_1 \Gamma^+ \cdots R_i$.

Symmetrically, we consider the $\mathcal{L}$-factorization $v = v_1 b_1 \cdots v_\ell b_\ell$ such that

$$h(b_{i-1} v_i b_i \cdots v_\ell b_\ell) <_{\mathcal{L}} h(v_i b_i \cdots v_\ell b_\ell) \mathrel{\mathcal{L}} h(b_i \cdots v_\ell b_\ell)$$

for all $i$. Let $j'_i$ be the position of $b_i$ in the above factorization. We color all positions of $v$ in any of the intervals $[\, j'_i - |S| + 1 \,;\, j'_i + |S| \,]$ blue. As before, there are at most $2|S|^2 - |S|$ blue positions. Let $B_i$ be the $i$-th consecutive factor of blue positions. Then $v = v'_1 B_1 \cdots v'_{\ell'} B_{\ell'}$ for some $v'_i \in \Gamma^+$, $i > 1$, and $v'_1 \in \Gamma^*$. As before, $B_i \cdots v'_{\ell'} B_{\ell'}$ is the shortest suffix of $v$ contained in $B_i \Gamma^+ \cdots \Gamma^+ B_{\ell'}$.

Next, we transfer the red positions of $u$ to $v$, and we transfer the blue positions of $v$ to $u$. By assumption, $v \in R_1 \Gamma^+ \cdots R_{k'} \Gamma^+$. Therefore, there exists a factorization $v = R_1 v''_1 \cdots R_{k'} v''_{k'}$ such that $R_1 v''_1 \cdots R_i$ is the shortest prefix of $v$ contained in $R_1 \Gamma^+ \cdots R_i$. We color the positions of the $R_i$'s in $v$ red. Similarly, there exists a factorization $u = u''_1 B_1 \cdots u''_{\ell'} B_{\ell'}$ such that $B_i \cdots u''_{\ell'} B_{\ell'}$ is the shortest suffix of $u$ contained in $B_i \Gamma^+ \cdots \Gamma^+ B_{\ell'}$. We color the positions of the $B_i$'s in $u$ blue. Now, colored positions in $u$ and $v$ are either red or blue or both. By Lemma 16 the colored positions in $u$ have the same order as the colored positions in $v$. Let $W_i$ be the $i$-th consecutive factor of colored (red or blue) positions, and write

$$u = W_1 x_1 \cdots W_{n-1} x_{n-1} W_n,$$
$$v = W_1 s_1 \cdots W_{n-1} s_{n-1} W_n.$$

By Lemma 1 and its left-right dual, there exist $e_1, \ldots, e_{n-1} \in E(S)$ and $f_2, \ldots, f_n \in E(S)$ such that each $W_i$ admits a factorization $W_i = p_i r_i q_i$ with $|p_i| \leq |S| - 1$ and $|q_i| \leq |S| - 1$ satisfying

$$h(r_i) = h(r_i)\, e_i \quad\text{for } 1 \leq i < n,$$
$$h(r_i) = f_i\, h(r_i) \quad\text{for } 1 < i \leq n.$$

In particular, we can assume $p_1 = \varepsilon = q_n$. Let $x'_i = q_i x_i p_{i+1}$ and $s'_i = q_i s_i p_{i+1}$ for $1 \leq i < n$. Then

$$u = r_1 x'_1 r_2 \cdots x'_{n-1} r_n,$$
$$v = r_1 s'_1 r_2 \cdots s'_{n-1} r_n,$$

and the $r_i$'s in $u$ cover the positions of the $\mathcal{R}$-factorization of $u$, whereas the $r_i$'s in $v$ cover the positions of the $\mathcal{L}$-factorization of $v$. Therefore, we have

$$h(r_1 x'_1 \cdots r_i) \mathrel{\mathcal{R}} h(r_1 x'_1 \cdots r_i) \cdot e_i\, h(x'_i)\, f_{i+1} \quad\text{for all } 1 \leq i < n,$$
$$h(r_i \cdots s'_{n-1} r_n) \mathrel{\mathcal{L}} e_{i-1}\, h(s'_{i-1})\, f_i \cdot h(r_i \cdots s'_{n-1} r_n) \quad\text{for all } 1 < i \leq n.$$

By an $(n-1)$-fold application of Lemma 17 we obtain

$$\begin{aligned}
h(u) &= h(r_1 x'_1 \cdots r_{n-2}\, x'_{n-2}\, r_{n-1}\, x'_{n-1}\, r_n) \\
&= h(r_1 x'_1 \cdots r_{n-2}\, x'_{n-2}\, r_{n-1}\, s'_{n-1}\, r_n) \\
&= h(r_1 x'_1 \cdots r_{n-2}\, s'_{n-2}\, r_{n-1}\, s'_{n-1}\, r_n) \\
&\;\;\vdots \\
&= h(r_1 s'_1 \cdots r_{n-2}\, s'_{n-2}\, r_{n-1}\, s'_{n-1}\, r_n) = h(v).
\end{aligned}$$

Note that the substitution rules $x'_i \to s'_i$ are $h$-invariant in their respective contexts only when applied from right to left when converting $h(u)$ into $h(v)$. □

Corollary 19. Let L C r + be recognized by a finite semigroup S € Bi. If words u and v are 
contained in the same monomials w\r*W2 ■ ■ ■ r*w n with n < 2 \S\ and degree \w\ ■ ■ ■ w n \ < 
4I5"! 2 , then h{u) = h(v). 

Proof. Every monomial $w_1 \Gamma^+ \cdots w_{n-1} \Gamma^+ w_n$ is a union of monomials of the form

\[ w_1 a_1 \Gamma^* \cdots w_{n-1} a_{n-1} \Gamma^* w_n \]

for $a_1, \ldots, a_{n-1} \in \Gamma$. Therefore, the claim follows from Proposition 18. □
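
As a small illustration of the monomials used here and in the following section, membership of a word in $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ can be tested greedily: the first factor must be a prefix, the last factor must be a suffix, and the inner factors are matched at their leftmost possible positions. The Python sketch below is not part of the paper; the name `in_monomial` and the flags `anchor_left` and `anchor_right` are hypothetical, and all factors are assumed to be non-empty strings. Dropping an anchor corresponds to adding a leading or trailing $\Gamma^*$.

```python
def in_monomial(word, factors, anchor_left=True, anchor_right=True):
    """Test membership of `word` in the monomial w_1 G* w_2 ... G* w_n.

    G stands for the alphabet. With anchor_left=False the monomial gets a
    leading G*, with anchor_right=False a trailing G*.
    Assumption: every factor is a non-empty string.
    """
    n = len(factors)
    if n == 0:
        raise ValueError("need at least one factor")
    if n == 1 and anchor_left and anchor_right:
        return word == factors[0]

    pos = 0
    # Factors before index `greedy_end` are matched greedily from the left.
    greedy_end = n - 1 if anchor_right else n
    for i, w in enumerate(factors[:greedy_end]):
        if i == 0 and anchor_left:
            if not word.startswith(w):
                return False
            pos = len(w)
        else:
            k = word.find(w, pos)  # leftmost occurrence starting at or after pos
            if k < 0:
                return False
            pos = k + len(w)

    if anchor_right:
        w = factors[-1]
        # The last factor must be a suffix that does not overlap earlier matches.
        return word.endswith(w) and len(word) - len(w) >= pos
    return True
```

For instance, `in_monomial("abab", ["ab", "ab"])` holds because $\Gamma^*$ contains the empty word, whereas `in_monomial("aba", ["ab", "ba"])` fails since the two factors would have to overlap.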
We are now ready to prove Theorem 13.

Proof (Theorem 13). '1 ⇔ 2': This follows from Theorem 3.

'2 ⇒ 3': By Lemma 6 the syntactic semigroup of every monomial $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ satisfies
$x^\omega y x^\omega \leq x^\omega$ and by Lemma 2 it is in $\mathbf{B}_1$. The claim follows since the class of languages
recognizable in $\mathbf{B}_1$ is closed under Boolean combinations. The implication '3 ⇒ 4' is trivial.

'4 ⇒ 2': Let $L$ be recognized by $h : \Gamma^+ \to S \in \mathbf{B}_1$. We write $u \equiv v$ if $u$ and $v$ are
contained in the same monomials of the form $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ of degree at most $4|S|^2$. We
have $L = h^{-1}(P)$ for $P = h(L)$. Corollary 19 shows that every set $h^{-1}(p)$ is a union of $\equiv$-classes.
Moreover, $\equiv$ has finite index since there are only finitely many monomials of bounded
degree. Every $\equiv$-class is a finite Boolean combination of the required form by specifying which
monomials hold and which do not. □
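
The equivalence used in this last step can be made concrete for very small parameters. The following Python sketch is an illustration only and not the authors' procedure: it assigns to each word the set of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ of bounded degree that contain it, and two words are equivalent when these sets coincide. The names `in_monomial`, `factor_tuples`, `signature` and `equivalent` are hypothetical, factors are assumed to be non-empty, and the brute-force enumeration is only feasible for degrees far below the bound $4|S|^2$.

```python
from itertools import product


def in_monomial(word, factors):
    """Membership in w_1 G* w_2 ... G* w_n (G = alphabet, factors non-empty)."""
    first, last = factors[0], factors[-1]
    if len(factors) == 1:
        return word == first
    if not word.startswith(first):
        return False
    pos = len(first)
    for w in factors[1:-1]:
        k = word.find(w, pos)  # leftmost match keeps the most room for later factors
        if k < 0:
            return False
        pos = k + len(w)
    return word.endswith(last) and len(word) - len(last) >= pos


def factor_tuples(alphabet, degree):
    """Yield all tuples of non-empty words whose total length is at most `degree`."""
    def extend(prefix, remaining):
        for length in range(1, remaining + 1):
            for letters in product(alphabet, repeat=length):
                tup = prefix + ("".join(letters),)
                yield tup
                yield from extend(tup, remaining - length)
    yield from extend((), degree)


def signature(word, alphabet, degree):
    """The set of bounded-degree monomials (as factor tuples) containing `word`."""
    return frozenset(t for t in factor_tuples(alphabet, degree)
                     if in_monomial(word, t))


def equivalent(u, v, alphabet, degree):
    """True if u and v lie in exactly the same monomials of degree <= `degree`."""
    return signature(u, alphabet, degree) == signature(v, alphabet, degree)
```

Each class of the equivalence is then described by the monomials in the signature (which must hold) and those outside it (which must not), matching the Boolean combination in the proof. With `alphabet="ab"` and `degree=2`, the words `aab` and `aaab` are equivalent, while `ab` forms a class of its own because it equals a monomial of degree 2.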

6 Dot-depth One without min or max 

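As for $\Sigma_1$, one cannot define min- or max-predicates in $\mathbb{B}\Sigma_1[<,+1]$. Therefore, the following
inclusions hold:

\[
\mathbb{B}\Sigma_1[<] \;\subsetneq\; \mathbb{B}\Sigma_1[<,+1] \;\subsetneq\;
\begin{matrix} \mathbb{B}\Sigma_1[<,+1,\min] \\[2pt] \mathbb{B}\Sigma_1[<,+1,\max] \end{matrix}
\;\subsetneq\; \mathbb{B}\Sigma_1[<,+1,\min,\max]
\]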
Simon's Theorem on piecewise testable languages [14] gives decidability of $\mathbb{B}\Sigma_1[<]$. For the
fragment $\mathbb{B}\Sigma_1[<,+1,\min,\max]$, decidability follows by Knast's Theorem [8], see Theorem 13.
In this section, we give effective characterizations of the remaining fragments. As for dot- 
depth 1/2, these characterizations are a combination of algebraic and topological properties, 
cf. [7]. Moreover, we obtain natural subclasses of dot-depth 1 for the languages definable by 
the above fragments. 

Lemma 20. Let $P = w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ and let $uq \in P$. Then there exists a monomial
$P' = v_1 \Gamma^* v_2 \cdots \Gamma^* v_n$ with $|v_1 \cdots v_n| \leq |w_1 \cdots w_n q|$ such that $uq \in P' \subseteq (Pq^{-1})q$.

Proof. Let $j \leq n$ be minimal such that $q \in \Gamma^* w_j \cdots \Gamma^* w_n$. If there exists a proper prefix $y$
of $w_{j-1}$ such that $y$ is a suffix of $u$ and $yq \in w_{j-1} \cdots \Gamma^* w_n$, then we set $q' = yq$, else we
set $q' = q$. We assume $q'$ to be maximal with these properties. We can write $uq = u''q'$.
Moreover, by maximality of $q'$, there exists an index $j'$ such that $q' \in \Gamma^* w_{j'} \cdots \Gamma^* w_n$ and
$q' \notin y' \Gamma^* w_{j'} \cdots \Gamma^* w_n$ for any non-empty suffix $y'$ of $w_{j'-1}$. We set $P' = w_1 \Gamma^* \cdots w_{j'-1} \Gamma^* q'$.
Now, $uq \in P'$ and for all $w \in P'$ we have $w \in P \cap \Gamma^* q = (Pq^{-1})q$. □

Lemma 21. Let $h : \Gamma^+ \to S \in \mathbf{B}_1$. If $u, v \in \Gamma^+$ are contained in the same monomials
$w_1 \Gamma^* \cdots w_n \Gamma^*$ of degree $|w_1 \cdots w_n| < 8|S|^2$, then $h(u) \mathrel{\mathcal{R}} h(v)$.

Proof. We write $u \equiv_m v$ if $u$ and $v$ are contained in the same monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$
of degree $|w_1 \cdots w_n| \leq m$. Analogously, we write $u \sim_m v$ if $u$ and $v$ are contained in the same
monomials $w_1 \Gamma^* \cdots w_n \Gamma^*$ of degree $|w_1 \cdots w_n| \leq m$. If $u \equiv_m v$ for $m = 4|S|^2 - 1$, then by
Corollary 19 we have $h(u) = h(v)$.

Let $u \sim_{2m} v$. We want to show $h(u) \mathrel{\mathcal{R}} h(v)$. We can assume $|v| > 2m$, because
otherwise $u = v$. Let $u = u'q$ with $|q| = m$. Consider the factorization $v = v'qx$ such that $qx$
is the shortest suffix of $v$ admitting $q$ as a factor, i.e., $v$ is factorized at the last occurrence
of $q$. This factorization exists, since $u \in \Gamma^* q \Gamma^* \ni v$. We claim $u \equiv_m v'q$ and therefore,
$h(v) \leq_{\mathcal{R}} h(v'q) = h(u)$. Symmetry then yields $h(u) \mathrel{\mathcal{R}} h(v)$.

We now prove the claim. First, let $v'q \in P = w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ with $|w_1 \cdots w_n| \leq m$. Then
$v \in P\Gamma^*$ and $u \in P\Gamma^*$. Since $w_n$ is a suffix of $q$, we conclude $u \in P$.

Next, suppose $u \in P = w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ with $|w_1 \cdots w_n| \leq m$. By Lemma 20 there exists
a monomial $P' = v_1 \Gamma^* v_2 \cdots \Gamma^* v_n$ with $|v_1 \cdots v_n| \leq |w_1 \cdots w_n q| \leq 2m$ and $u'q \in P' \subseteq (Pq^{-1})q$.
Since $u \in P'\Gamma^*$, we obtain $v \in P'\Gamma^*$. By choice of $x$, we have $v'q \in P'\Gamma^* \subseteq P\Gamma^*$. Since $w_n$ is
a suffix of $q$, we conclude $v'q \in w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$. □

Theorem 22. Let $L \subseteq \Gamma^+$. The following assertions are equivalent:
1. $L$ is definable in $\mathbb{B}\Sigma_1[<,+1,\min]$.
2. $L$ is a Boolean combination of monomials $w_1 \Gamma^* \cdots w_n \Gamma^*$.
3. $\mathrm{Synt}(L) \in \mathbf{B}_1$ and the syntactic homomorphism $h_L : \Gamma^+ \to \mathrm{Synt}(L)$ has the property
that $h_L(L)$ is a union of $\mathcal{R}$-classes.

Proof. The equivalence '1 ⇔ 2' follows from Theorem 10.

'2 ⇒ 3': We have $\mathrm{Synt}(L) \in \mathbf{B}_1$ by Theorem 13 and $h_L(L)$ is a union of $\mathcal{R}$-classes by
[7, Theorem 5].

'3 ⇒ 2': By Lemma 21 there exists $m \in \mathbb{N}$ such that $h_L(u) \mathrel{\mathcal{R}} h_L(v)$ if $u$ and $v$ are
contained in the same languages of the form $w_1 \Gamma^* \cdots w_n \Gamma^*$ with $|w_1 \cdots w_n| < m$. Therefore,
for each $\mathcal{R}$-class $R$ of $\mathrm{Synt}(L)$, the language $h_L^{-1}(R)$ is a Boolean combination of languages
$w_1 \Gamma^* \cdots w_n \Gamma^*$ with $|w_1 \cdots w_n| < m$. The claim follows, since $L$ is a union of languages of the
form $h_L^{-1}(R)$. □
There also is a left-right dual of the above theorem: A language $L$ is definable in $\mathbb{B}\Sigma_1[<,+1,\max]$
if and only if $L$ is a Boolean combination of monomials $\Gamma^* w_1 \cdots \Gamma^* w_n$ if and only if $\mathrm{Synt}(L) \in \mathbf{B}_1$
and $h_L(L)$ is a union of $\mathcal{L}$-classes. Next, we consider the fragment $\mathbb{B}\Sigma_1[<,+1]$ with neither
min nor max.

Lemma 23. Let $h : \Gamma^+ \to S \in \mathbf{B}_1$. If $u, v \in \Gamma^+$ are contained in the same monomials
$\Gamma^* w_1 \Gamma^* \cdots w_n \Gamma^*$ of degree $|w_1 \cdots w_n| < 12|S|^2$, then $h(u) \mathrel{\mathcal{J}} h(v)$.

Proof. This proof is only a slight variation of the proof of Lemma 21. We write $u \equiv_m v$ if $u$
and $v$ are contained in the same monomials $w_1 \Gamma^* \cdots \Gamma^* w_n$ of degree $|w_1 \cdots w_n| \leq m$. Analogously,
we write $u \sim_m v$ if $u$ and $v$ are contained in the same monomials $\Gamma^* w_1 \Gamma^* \cdots \Gamma^* w_n \Gamma^*$
of degree $|w_1 \cdots w_n| \leq m$. If $u \equiv_m v$ for $m = 4|S|^2 - 1$, then by Corollary 19 we have
$h(u) = h(v)$.

Let $u \sim_{3m} v$. We want to show $h(u) \mathrel{\mathcal{J}} h(v)$. We can assume $|u|, |v| > 3m$, because
otherwise $u = v$. Let $u = pu'q$ with $|p| = |q| = m$. Consider the factorization $v = spv'qx$ such
that $sp$ is the shortest prefix of $v$ admitting $p$ as a factor and $qx$ is the shortest suffix of $v$
admitting $q$ as a factor, i.e., $v$ is factorized at the first occurrence of $p$ and the last occurrence
of $q$. This factorization exists, since $u \in \Gamma^* p \Gamma^* q \Gamma^* \ni v$. We claim $u \equiv_m pv'q$ and therefore,
$h(v) \leq_{\mathcal{J}} h(pv'q) = h(u)$. Symmetry then yields $h(u) \mathrel{\mathcal{J}} h(v)$.

We now prove the claim. First, let $pv'q \in P$ for $P = w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ with $|w_1 \cdots w_n| \leq m$.
Then $v \in \Gamma^* P \Gamma^*$ and $u \in \Gamma^* P \Gamma^*$. Since $w_1$ is a prefix of $p$ and $w_n$ is a suffix of $q$, we conclude
$u \in P$.

Next, suppose $u \in P$ with $|w_1 \cdots w_n| \leq m$. By Lemma 20 and its left-right dual, there
exists a monomial $P' = v_1 \Gamma^* v_2 \cdots \Gamma^* v_n$ with $|v_1 \cdots v_n| \leq |p w_1 \cdots w_n q| \leq 3m$ and $u = pu'q \in
P' \subseteq p(p^{-1} P q^{-1})q$. Since $u \in \Gamma^* P' \Gamma^*$, we obtain $v \in \Gamma^* P' \Gamma^*$. By choice of $s$ and $x$, we
have $pv'q \in \Gamma^* P' \Gamma^* \subseteq \Gamma^* P \Gamma^*$. Since $w_1$ is a prefix of $p$ and $w_n$ is a suffix of $q$, we conclude
$pv'q \in w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$. □
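
The factorizations used in the proofs of Lemma 21 and Lemma 23 are easy to compute. The following Python sketch is an illustration under the assumption that words are plain strings; the function names `factor_at_last` and `factor_first_last` are hypothetical. The first function splits $v = v'qx$ at the last occurrence of $q$, the second splits $v = spv'qx$ at the first occurrence of $p$ and the last occurrence of $q$, and both return `None` when no such factorization exists.

```python
def factor_at_last(v, q):
    """Return (v1, q, x) with v == v1 + q + x and q + x the shortest suffix
    of v that has q as a factor, or None if q does not occur in v."""
    k = v.rfind(q)  # start of the last occurrence of q
    if k < 0:
        return None
    return v[:k], q, v[k + len(q):]


def factor_first_last(v, p, q):
    """Return (s, p, v1, q, x) with v == s + p + v1 + q + x, where s + p is
    the shortest prefix containing p and q + x the shortest suffix
    containing q. Returns None if the two occurrences would overlap or if
    p or q does not occur, i.e. if v is not in G* p G* q G*."""
    i = v.find(p)   # start of the first occurrence of p
    k = v.rfind(q)  # start of the last occurrence of q
    if i < 0 or k < 0 or i + len(p) > k:
        return None
    return v[:i], p, v[i + len(p):k], q, v[k + len(q):]
```

Using the first occurrence of $p$ and the last occurrence of $q$ leaves the largest possible middle part $v'$, which is why the check `i + len(p) > k` suffices to decide whether any factorization of this shape exists.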

Theorem 24. Let $L \subseteq \Gamma^+$. The following assertions are equivalent:
1. $L$ is definable in $\mathbb{B}\Sigma_1[<,+1]$.
2. $L$ is a Boolean combination of monomials $\Gamma^* w_1 \cdots \Gamma^* w_n \Gamma^*$.
3. $\mathrm{Synt}(L) \in \mathbf{B}_1$ and the syntactic homomorphism $h_L : \Gamma^+ \to \mathrm{Synt}(L)$ has the property
that $h_L(L)$ is a union of $\mathcal{J}$-classes.

Proof. The equivalence '1 ⇔ 2' follows from Theorem 11.

'2 ⇒ 3': We have $\mathrm{Synt}(L) \in \mathbf{B}_1$ by Theorem 13 and $h_L(L)$ is a union of $\mathcal{J}$-classes by
[7, Theorem 7].

'3 ⇒ 2': By Lemma 23 there exists $m \in \mathbb{N}$ such that $h_L(u) \mathrel{\mathcal{J}} h_L(v)$ if $u$ and $v$ are
contained in the same languages of the form $\Gamma^* w_1 \Gamma^* \cdots w_n \Gamma^*$ with $|w_1 \cdots w_n| < m$. Since
$h_L(L)$ is a union of $\mathcal{J}$-classes, the language $L$ is a Boolean combination of languages of the
form $\Gamma^* w_1 \cdots \Gamma^* w_n \Gamma^*$ of degree $|w_1 \cdots w_n| < m$. □

The following decidability result is an immediate consequence of our characterizations. 

Corollary 25. Let $L \subseteq \Gamma^+$ be a regular language. It is decidable whether $L$ is definable in
$\mathbb{B}\Sigma_1[<,+1]$ (resp. $\mathbb{B}\Sigma_1[<,+1,\min]$, resp. $\mathbb{B}\Sigma_1[<,+1,\max]$).

Proof. The syntactic homomorphism $h_L : \Gamma^+ \to \mathrm{Synt}(L)$ of $L$ is effectively computable.
Hence, one can verify whether property '3' in Theorem 24 (resp. '3' in Theorem 22, resp. the
left-right dual of '3' in Theorem 22) holds. □
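
The second half of property 3, namely that the image of the language is a union of classes of a Green's relation, can be checked directly on a multiplication table. The Python sketch below is an illustration and not the authors' algorithm. It assumes that the finite semigroup is given as a table `mul` with elements `0, ..., k-1` and `mul[a][b]` the product of `a` and `b`; the names `ideal` and `is_union_of_classes` are hypothetical. Membership of the semigroup in $\mathbf{B}_1$ is not tested here and would have to be verified separately.

```python
def ideal(mul, s, kind):
    """Principal right ideal s S^1 ("R"), left ideal S^1 s ("L"),
    or two-sided ideal S^1 s S^1 ("J") generated by s."""
    elems = range(len(mul))
    if kind == "R":
        return frozenset({s} | {mul[s][t] for t in elems})
    if kind == "L":
        return frozenset({s} | {mul[t][s] for t in elems})
    if kind == "J":
        left = {s} | {mul[t][s] for t in elems}
        return frozenset(left | {mul[x][t] for x in left for t in elems})
    raise ValueError("kind must be 'R', 'L' or 'J'")


def is_union_of_classes(mul, subset, kind):
    """True if `subset` is a union of classes of the given Green's relation.

    Two elements are related iff they generate the same principal ideal,
    so the subset is a union of classes iff no element inside it generates
    the same ideal as an element outside it.
    """
    subset = set(subset)
    ideals = [ideal(mul, s, kind) for s in range(len(mul))]
    return all(ideals[s] != ideals[t]
               for s in subset
               for t in range(len(mul)) if t not in subset)


# Two-element left-zero semigroup (a * b == a), the syntactic semigroup of
# the language a G* over the alphabet {a, b}; element 0 is the image of a G*.
LEFT_ZERO = [[0, 0],
             [1, 1]]
```

On `LEFT_ZERO` with `subset={0}`, the check succeeds for `"R"` and fails for `"L"` and `"J"`. This agrees with the characterizations above for the language $a\Gamma^*$, which is a monomial of the form $w_1 \Gamma^*$ but is not a Boolean combination of monomials without a specified prefix.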
Table 1: Languages around dot-depth one.

7 Summary 

We considered subclasses of languages with dot-depth 1/2 and of languages with dot-depth 1.
These subclasses admit counterparts in terms of fragments of existential first-order logic $\Sigma_1$ and
its Boolean closure $\mathbb{B}\Sigma_1$. For all fragments, we give effective algebraic characterizations. On
closer inspection, the characterizations are a conjunction of an algebraic and a topological property,
cf. [7]. We summarize our main results in Table 1. To shorten notation, we write $\mathbf{B}_{1/2}$ instead
of $[\![x^\omega y x^\omega \leq x^\omega]\!]$.

In addition, we give new proofs for Pin and Weil's Theorem on dot-depth 1/2 and for 
Knast's Theorem on dot-depth 1. The proofs are combinatorial and they improve the bounds 
involved in computing a language description for a given recognizing semigroup. 